**************************************************************************************
Title: Borderline Democracy? The Electoral Consequences of the 2021 State of Emergency
on the Poland-Belarus Border
Authors: Paweł Charasz and Anil Menon
Contact: Paweł Charasz <pawel.charasz@gmail.com>
This Version: 2025-May-31
**************************************************************************************

This repository contains the code and data needed to replicate the results reported in the paper and the online appendices. Before running the code, set the current working directory to the location of this README file.

The repository contains the following files:

1.  README.TXT
2.  replication_code.do - code needed to build the complete dataset and replicate all analyses
3.  Codebook.doc - codebook listing and describing variables in the dataset
4.  Data/polling_stations_data2015.dta - data with 2015 polling stations
5.  Data/polling_stations_data2019_geocoded.dta - data with geocoded 2019 polling stations
6.  Data/polling_stations_data2023_geocoded.dta - data with geocoded 2023 polling stations
7.  Data/Commuters/commuter_flows_matrix.xlsx - data with inter-municipality commuter flows
8.  Data/Commuters/population2021.dta - data with municipality population
9.  Data/Crosswalks/crosswalk_2015_2019.dta - crosswalk file for matching 2015 and 2019 polling stations in the main sample
10. Data/Crosswalks/crosswalk_2015_2019_ukraine.dta - crosswalk file for matching 2015 and 2019 polling stations in the placebo sample
11. Data/Crosswalks/crosswalk_2019_2023.dta - crosswalk file for matching 2019 and 2023 polling stations in the main sample
12. Data/Crosswalks/crosswalk_2019_2023_ukraine.dta - crosswalk file for matching 2019 and 2023 polling stations in the placebo sample
13. Data/Elections/results_sejm2015.dta - results of 2015 Sejm elections by polling station
14. Data/Elections/results_sejm2019.dta - results of 2019 Sejm elections by polling station
15. Data/Elections/results_sejm2023.dta - results of 2023 Sejm elections by polling station
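
Before starting a long run, it may help to confirm that every file listed above is in place. The following is a minimal Python sketch, not part of the replication code; the function name missing_files is our own illustration, and the script assumes it is run from the directory containing this README. On case-sensitive file systems, the file names must match the list exactly.

```python
from pathlib import Path

# File list copied from this README (items 1-15).
EXPECTED_FILES = [
    "README.TXT",
    "replication_code.do",
    "Codebook.doc",
    "Data/polling_stations_data2015.dta",
    "Data/polling_stations_data2019_geocoded.dta",
    "Data/polling_stations_data2023_geocoded.dta",
    "Data/Commuters/commuter_flows_matrix.xlsx",
    "Data/Commuters/population2021.dta",
    "Data/Crosswalks/crosswalk_2015_2019.dta",
    "Data/Crosswalks/crosswalk_2015_2019_ukraine.dta",
    "Data/Crosswalks/crosswalk_2019_2023.dta",
    "Data/Crosswalks/crosswalk_2019_2023_ukraine.dta",
    "Data/Elections/results_sejm2015.dta",
    "Data/Elections/results_sejm2019.dta",
    "Data/Elections/results_sejm2023.dta",
]


def missing_files(root="."):
    """Return the expected files that are not present under root."""
    root = Path(root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for name in missing:
            print("  " + name)
    else:
        print("All", len(EXPECTED_FILES), "files found.")
```

If the script reports missing files, check that the working directory is the one containing this README.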

***********************
* REPLICATION:
***********************

To replicate the analyses, run replication_code.do in Stata.

***********************
* SOFTWARE AND HARDWARE REQUIREMENTS
***********************

The analysis was performed on a personal computer running Windows 11, using Stata 18.0 SE (Update level: 26 Feb 2025) and Python 3.13.2. In addition to the base versions of Stata and Python, replication requires the additional packages and modules listed below.

***********************
** Stata packages
***********************

To run the replication code, the following Stata packages are needed:

coefplot (Distribution-Date: 20230225)
ebalance (Distribution-Date: 20150130)
estout (Distribution-Date: 20230212)
grstyle (Distribution-Date: 20200919)
rdlocrand (Distribution-Date: 20220621)
rdmse (Distribution-Date: 20230516)
rdrobust (Distribution-Date: 20220930)
texdoc (Distribution-Date: 20180418)

To install these packages, run the following in Stata:

	ssc install coefplot
	ssc install ebalance
	ssc install estout
	ssc install grstyle
	net install rdlocrand, from(https://raw.githubusercontent.com/rdpackages/rdlocrand/master/stata)
	ssc install rdmse
	ssc install rdrobust
	ssc install texdoc

***********************
** Python modules
***********************

To run the replication code, the following Python modules are needed:

matplotlib (3.10.1)
numpy (2.2.3) 
pandas (2.2.3)
rdlocrand (1.0.5)
statsmodels (0.14.4)

To install these modules using pip, type the following in the command line (py is the Python launcher for Windows; on other systems, replace py with python or python3):

	py -m pip install matplotlib==3.10.1
	py -m pip install numpy==2.2.3
	py -m pip install pandas==2.2.3
	py -m pip install rdlocrand==1.0.5
	py -m pip install statsmodels==0.14.4
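
To confirm that the installed modules match the versions listed above, a short check along the following lines may be useful. This is a sketch only and not part of the replication code; the function name check_versions is our own illustration. It uses importlib.metadata from the Python standard library and should be run with the same Python interpreter that Stata is associated with.

```python
from importlib.metadata import PackageNotFoundError, version

# Module versions copied from this README.
REQUIRED = {
    "matplotlib": "3.10.1",
    "numpy": "2.2.3",
    "pandas": "2.2.3",
    "rdlocrand": "1.0.5",
    "statsmodels": "0.14.4",
}


def check_versions(required=REQUIRED):
    """Map each module name to (expected version, installed version or None)."""
    report = {}
    for name, expected in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            # The module is not installed for this interpreter.
            installed = None
        report[name] = (expected, installed)
    return report


if __name__ == "__main__":
    for name, (expected, installed) in check_versions().items():
        status = "OK" if installed == expected else "CHECK"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

A line marked CHECK means the module is missing or its version differs from the one we used; other versions may still work, but we have not tested them.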

***********************
* ADJUST NUMBER OF REPLICATIONS TO SPEED UP CODE
***********************

When conducting randomization inference, we base our estimates on 10,000 replications. Given the extent of the analyses, 10,000 replications may consume too many computational resources. To replicate the analyses with fewer replications, change "global reps = 10000" on line 18 of replication_code.do to a smaller number, for example, 1,000. In our experience, using 1,000 replications yields highly comparable estimates.

Replication time on our PC:
1,000 replications: 5.5 hours
10,000 replications: 51 hours

***********************
* TROUBLESHOOTING
***********************
** Associate Stata with Python
***********************

For the code to execute correctly, Stata must be associated with Python. To check whether it is (and whether it points to the correct Python version), run the following in Stata:
	
	python query

If Stata is not yet associated with Python, you can set the association by typing:

	set python_exec path_to_python.exe

where path_to_python.exe is the path to the file python.exe on your machine. 
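
If you are unsure where python.exe is located, one way to find it is to ask Python itself. The sketch below is not part of the replication code, and the function name python_path is our own illustration. It prints the full path of the interpreter that runs it, which can then be passed to set python_exec.

```python
import sys


def python_path():
    """Return the full path of the interpreter running this script."""
    return sys.executable


if __name__ == "__main__":
    print(python_path())
```

Run the script with the Python installation that has the modules listed above, so that Stata is pointed at that same interpreter.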

***********************
** Stata: error with "import numpy" - "ImportError: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch your python interpreter from there."
***********************

One possible reason why "import numpy" fails in Stata is that the Microsoft Visual C++ Redistributable, which NumPy needs to load its C extensions, is not installed. If you do not have it, try installing it from here: https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170
